-
-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed a bug in the Lua 5.1 bytecode pattern #298
base: master
Are you sure you want to change the base?
Conversation
Lua 5.1 bytecode upvalue fields were incorrectly defined as a `Vector<u32>` instead of `Vector<LuaString>`.
I know very little about Lua byte code, but having separate patterns for each version of Lua makes dealing with problems like this a pain. Instead of checking every other Lua to see if the error exists there I unified all the Lua versions into one. The resulting pattern has this definition for the upValues struct UpValues {
if (minorVersion == 1)
Vector<u32> upvalues;
else
Vector<LuaString> upvalues;
}[[inline]]; Not only shows a potential error but also suggests what the fix would be and all of this without having to visit each and every version there is. I am attaching the unified version (change the extension that github cant deal with to .hexpat) I wrote which I tested successfully with all the tests for lua versions ImHex has, but probably needs more rigorous tests given that this error was not found. |
As far as I'm aware lua does not use u32s or integers of any kind for it's upvalue vectors. I don't know of any lua version that does. The upvalue vector is only there for debug information. |
Lua 5.1 bytecode string sizes were incorrectly parsed as a 64-bit integer, instead of a 32-bit integer.
Just added a commit for a separate similar bug. I forgot to include this in the initial pull request. |
I put a link to the unified version of the lua pattern. Did you look at it? Ill post it here in case you cant get the link #pragma description Lua 5.1, 5.2, 5.3 and 5.4 bytecode
import std.io;
import std.mem;
import type.base;
u8 minorVersion = std::mem::read_unsigned(4,1)&15;
namespace impl {
fn transform_Size_array(auto array) {
u128 res = 0;
for(u8 i = 0, (array[i] & 0x80 == 0) && (i < 9), i+=1) {
res <<= 7;
res |= array[i] & 0x7f;
}
res <<= 7;
res |= array[sizeof(array)-1] & 0x7f;
return res;
};
fn format_Size(auto leb128) {
if (minorVersion == 4) {
u128 res = impl::transform_Size_array(leb128.array);
return std::format("{} ({:#x})", res, res);
} else
return std::format("{}",leb128.size);
};
fn format_StringSize(auto leb128) {
if (minorVersion == 4) {
u128 res = impl::transform_Size_array(leb128.size.array);
return std::format("{} ({:#x})", res, res);
} else
return std::format("{}",leb128.size);
};
fn transform_Size(auto leb128) {
if (minorVersion == 4)
return impl::transform_Size_array(leb128.array);
else
return leb128.size;
};
fn transform_StringSize(auto leb128) {
if (minorVersion == 4)
return impl::transform_Size_array(leb128.size.array);
else
return leb128.size;
};
fn format_LuaString(auto string) {
if (string.size == 0) {
return "None";
}
return std::format("\"{}\"", string.data);
};
fn format_Constant(auto constant) {
return constant.data;
};
fn format_Version(auto ver) {
return std::format("Ver. {}.{}", ver.major, ver.minor);
};
}
using LuaFunction;
struct Size {
if (minorVersion == 4)
u8 array[while(addressof(this) == $ || ((addressof(this)-$ < 9) && (std::mem::read_unsigned($-1, 1) & 0x80 == 0)))] [[hidden]];
else
u32 size[[hidden]];
} [[sealed, format("impl::format_Size"), transform("impl::transform_Size")]];
struct LuaStringSize {
if (minorVersion == 4)
Size size;
else if (minorVersion == 3)
u8 size;
else
u64 size;
}[[sealed, format("impl::format_StringSize"), transform("impl::transform_StringSize")]];
bitfield Version {
minor : 4;
major : 4;
} [[format("impl::format_Version")]];
struct LuaBinaryHeader {
char magic[4];
Version version_number;
u8 format_version;
if (minorVersion == 3 || minorVersion == 4)
char LUAC_DATA[6];
else
u8 endianness;
u8 size_of_int;
u8 size_of_size_t;
match (minorVersion) {
(1 | 2) : {
u8 size_of_Instruction;
u8 lua_Number;
u8 is_lua_Number_integral;
if (minorVersion==2)
char LUAC_TAIL[6];
}
(3 | 4) : {
if (minorVersion == 3) {
u8 size_of_Instruction;
u8 size_of_lua_Integer;
}
u8 sizeof_lua_Number;
type::Hex<u64> LUAC_INT;
double LUAC_NUM;
}
}
};
struct LuaString {
LuaStringSize size;
bool correction = (minorVersion == 3 || minorVersion == 4);
if (size > 0)
char data[size-correction];
}[[format("impl::format_LuaString")]];
struct SourceOnDebug {
if (minorVersion == 2)
LuaString source;
}[[inline,format("format_source_on")]];
struct SourceOnFunction {
if (minorVersion != 2)
LuaString source;
}[[inline,format("format_source_on")]];
fn format_source_on(auto v) {
return std::format("{}",v.source);
};
enum LUA : u8 {
TNIL = 0,
TBOOLEAN,
TNUMFLT = minorVersion==4 ? 3 : 0x13,
TSHRSTR = 4,
TNUMINT = minorVersion==4 ? 0X13 : 3,
TLNGSTR = 0x14
};
struct LuaConstant {
LUA type;
match (type) {
//(LUA::TNIL) : NULL
(LUA::TBOOLEAN) : u8 data;
(LUA::TNUMFLT) : double data;
(LUA::TNUMINT) : u64 data;
(LUA::TSHRSTR | LUA::TLNGSTR) : LuaString data;
}
}[[format("impl::format_Constant")]];
struct LuaUpvalue {
u8 instack;
u8 idx;
if (minorVersion == 4)
u8 kind;
};
struct Vector<T> {
Size size;
if (size > 0)
T values[size];
};
struct AbsLineInfo {
Size pc;
Size line;
};
struct LocalVar {
LuaString varname;
Size startpc;
Size endpc;
};
struct LineInfo {
if (minorVersion == 4) {
Vector<s8> lineInfo;
Vector<AbsLineInfo> abslineinfo;
} else
Vector<u32> lineInfo;
}[[inline]];
struct UpValues {
if (minorVersion == 1)
Vector<u32> upvalues;
else
Vector<LuaString> upvalues;
}[[inline]];
struct LuaDebugInfo {
SourceOnDebug source;
LineInfo lineInfo;
Vector<LocalVar> local_vars;
UpValues upvalues;
};
struct LuaCUP{ //Constants,Upvalues and Protos
Vector<LuaConstant> constants;
if (minorVersion == 1 || minorVersion==2) {
u32 sizep; // size of proto
if (sizep > 0) {
LuaFunction protos[sizep];
}
}
if (minorVersion != 1)
Vector<LuaUpvalue> upvalues;
if (minorVersion == 3 || minorVersion == 4)
Vector<LuaFunction> protos;
}[[inline]];
struct LuaFunction {
SourceOnFunction source;
Size line_defined;
Size last_line_defined;
if (minorVersion == 1)
u8 nups; // number of upvalues
u8 number_of_parameters;
u8 is_vararg;
u8 maxstacksize;
Vector<u32> code;
LuaCUP cup;
LuaDebugInfo debugInfo;
};
struct LuaFile {
LuaBinaryHeader header;
if (minorVersion == 3 || minorVersion == 4)
u8 sizeupvalues;
LuaFunction func;
};
LuaFile file @ 0; |
I just checked the sample Lua 5.1 sample that is used in unit tests and the string is preceded by a 64 bit integer holding its size. The change that you propose turns a successful test into a failing one of reading past the end of the file. Make sure to test the changes on actual files and if you think the ones we use are invalid provided valid ones. |
I did test the patterns before making these changes. After looking into how lua defines size_t, the test samples were created from a lua 64 bit compiler, where as I was using a 32 bit compiler. The unified pattern you sent does seem to account for size_t's variable size, so I guess this was an issue specific to the lua 5.1 pattern. I'll update my pull request to account for size_t's variable size. |
I don't follow. The unified pattern is exactly equivalent to each of the separate versions, or at least , it aims to be. If one of the versions has a problem, then the unified version will have the exact same problem. For example, for string size type it defines: struct LuaStringSize {
if (minorVersion == 4)
Size size;
else if (minorVersion == 3)
u8 size;
else
u64 size;
}[[sealed, format("impl::format_StringSize"), transform("impl::transform_StringSize")]]; So both versions 1 and 2 use u64 like the 5.1 and 5.2 versions do. Im looking into getting better samples so we can test things better. |
Sorry I've made a mistake again.😅The unified version does not account for 32-bit vs 64-bit compilers, and assumes 64-bit. I probably should have read through it more thoroughly. |
Also I'm not sure how I'd easily make either the unified version or the 5.1 version account for this. Ideally |
Updated the pattern to allow for 32-bit and 64-bit integers for the size of the string length field.
something like this should work: u8 global_size_of_size_t=4; we assign a default just in case. Then inside header, right after size_of_size_t is created global_size_of_size_t = size_of_size_t; finally on StringSize struct LuaStringSize {
if (minorVersion == 4)
Size size;
else if (minorVersion == 3)
u8 size;
else {
if (global_size_of_size_t == 4 )
u32 size;
else
u64 size;
}
}[[sealed, format("impl::format_StringSize"), transform("impl::transform_StringSize")]]; I am showing the code in unified pattern but it should be the same for 5.1. What about the other versions? |
That could work, I've managed to implement it differently. You can review that and see which works better. |
Why are you only fixing lua 5.1? Are the other Luas also not possibly erroneous? Considering that we have to deal with 4 versions, there should be an attempt to keep them similar to each other to help debug issues that deal with more than just one version, so breaking structures that are kept together in the other versions doesnt seem like a good idea. The main issue is why don't you think it is a good idea to get rid of the version files and use one instead? That avoids code duplication, reduces code management, makes it more clear what is the same and what isn't across versions and helps showcasing the pattern language ability to deal with special cases among other things. What are the downsides? |
A better solution than both u8 stringSize = std::mem::read_unsigned((minorVersion>2) ? 13 : 8,1); eliminates the need to copy from the pattern to a global or the need to break the main placed variable into pieces. |
We should combine the versions, I didn't do that because it's not what the original issue was trying to fix. But yeah absolutely we should. I also have lots of experience with lua 5.1 bytecode, but haven't looked into the other versions much. |
Lua 5.1 bytecode upvalue fields were incorrectly defined as a
Vector<u32>
instead ofVector<LuaString>
.